How to overtake Google in MT quality - the Baltic case

Author

  • Andrejs Vasiljevs
Abstract

The language technology company Tilde aims to improve the quality of machine translation for lesser-resourced languages such as those of the Baltic countries. Generic MT solutions like Google Translate perform poorly for these complex languages. To compensate for the shortage of training data and to deal with rich morphology, we apply different approaches that combine statistical methods with linguistic rules. We will present the strategies applied and the results of various experiments. We will discuss the application of production systems that show significantly better translation quality compared to Google Translate. We will also outline how this work contributes to the creation of a European infrastructure for automated translation.

Similar resources

On the Translation Quality of Google Translate: With a Concentration on Adjectives

Translation, whose first traces date back at least to 3000 BC (Newmark, 1988), has always been considered time-consuming and labor-intensive. In view of this, experts have made numerous efforts to develop mechanical systems that can reduce part of this time and labor. The advancement of computers in the second half of the twentieth century paved the way for the invention of machine tra...

Machine translation for e-Government – the Baltic case

This paper presents a case study about the development of MT systems for two Baltic governments. The governments of Latvia and Lithuania presented Tilde with a need to expand their communication to reach multilingual citizens. In order to meet this need, Tilde collected a vast amount of domain-specific data and trained MT systems to produce high-quality translation. In the process, Tilde identif...

Improving SMT with Morphology Knowledge for Baltic Languages

In recent years, several machine translation systems have been built for the Baltic languages. Besides the Google and Microsoft machine translation engines and research experiments with statistical MT for Latvian [1] and Lithuanian, there are both English-Latvian [2] and English-Lithuanian [3] rule-based MT systems available. Both Latvian and Lithuanian are morphologically rich languages with qu...

Real-world challenges in application of MT for localization: the Baltic case

In this paper we share our experience from implementing machine translation in localization into relatively small languages of the three Baltic countries – Latvian, Lithuanian, and Estonian. We describe our approach in improving terminology translation and consistency by preprocessing of the source text and performing term integration. We present results of a formal evaluation of MT impact on t...

Ecological-Economic Fisheries Management Advice—Quantification of Potential Benefits for the Case of the Eastern Baltic COD Fishery

Citation: Voss R, Quaas MF, Stoeven MT, Schmidt JO, Tomczak MT and Möllmann C (2017) Ecological-Economic Fisheries Management Advice—Quantification of Potential Benefits for the Case of the Eastern Baltic COD Fishery. Front. Mar. Sci. 4:209. doi: 10.3389/fmars.2017.00209 Ecological-Economic Fisheries Management Advice—Quantification of Potential Benefits for the Case of the Eastern Baltic COD F...

Publication date: 2014